Abstract

We search for the decay $B^+ \to \ell^+\nu_\ell\gamma$ with $\ell^+ = e^+$ or $\mu^+$ using the full Belle data set of $772 \times 10^6$ $B\bar{B}$ pairs, collected at the $\Upsilon(4S)$ resonance with the Belle detector at the KEKB asymmetric-energy $e^+e^-$ collider. We reconstruct one $B$ meson in a hadronic decay mode and search for the $B^+ \to \ell^+\nu_\ell\gamma$ decay in the remainder of the event. We observe no significant signal within the phase space of $E_\gamma^{\rm sig} > 1$ GeV and obtain upper limits of $\mathcal{B}(B^+ \to e^+\nu_e\gamma) < 6.1 \times 10^{-6}$, $\mathcal{B}(B^+ \to \mu^+\nu_\mu\gamma) < 3.4 \times 10^{-6}$, and $\mathcal{B}(B^+ \to \ell^+\nu_\ell\gamma) < 3.5 \times 10^{-6}$ at 90% credibility level.

PACS numbers: 13.20.He, 14.40.Nd 


I. INTRODUCTION 


The semileptonic decay $B^+ \to \ell^+\nu_\ell\gamma$ [2] proceeds via a $\bar{b}u$ annihilation into a $W^+$ boson that decays into a lepton-neutrino pair. This is accompanied by a photon emission from one of the participating charged particles, with emission from the up quark being the dominant contribution. The decay can be computed in Heavy Quark Effective Theory [3], which is valid for the emission of a high-energy photon above the QCD scale $\Lambda_{\rm QCD}$.

The resulting decay amplitude depends on the first inverse moment $1/\lambda_B = \int_0^\infty d\omega\, \Phi_B^+(\omega)/\omega$, where $\Phi_B^+(\omega)$ is the $B$ meson light-cone distribution amplitude in the high energy limit. This parameter is an important input to the QCD factorization scheme used in non-leptonic $B$ decay amplitudes [1]; a tighter limit on, or a fortiori a measurement of, $\lambda_B$ would improve the predictions for all of these processes. To produce consistent results for color-suppressed modes in non-leptonic $B$ decays, values of roughly $\lambda_B \approx 200$ MeV are needed. The parameter cannot be calculated reliably by theory and thus has to be measured experimentally. The decay discussed in this Letter is advantageous since no additional unknown parameters are needed for its calculation in leading order.


The branching fraction of the decay $B^+ \to \ell^+\nu_\ell\gamma$ is expected to be larger than that of the purely leptonic $B^+ \to \ell^+\nu_\ell$ decay as the photon removes the helicity suppression of the process, thus enhancing the weak decay amplitude. This effect is diminished by the additional electromagnetic coupling introduced by the photon emission. The $B^+ \to \ell^+\nu_\ell\gamma$ decay has been calculated up to first-order corrections in $1/m_b$, and radiative corrections at next-to-leading logarithmic order [3]. The differential branching fraction is given by
with $x_\gamma = 2E_\gamma/m_B$. Here, $m_B$ is the $B$ meson mass, $G_F$ the Fermi coupling constant, $V_{ub}$ the CKM matrix element, and $F_A$ and $F_V$ the axial and vector form factors, respectively. The form factors are given by

where $Q_{\ell,u,b}$ are the charges of the lepton, up quark, and bottom quark, respectively, $f_B$ is the decay constant for the $B$ meson, and $R(E_\gamma, \mu)$ is the radiative correction calculated at the energy scale $\mu$. The first term in the form factors containing $\lambda_B$ represents the leading-order contribution of the QCD heavy-quark expansion describing the photon emission by the light quark. The leading-order term is corrected for higher-order radiative effects, with the $R(E_\gamma, \mu)$ factor containing mass corrections for the up quark. The remaining terms in square brackets are $1/m_b$ power corrections, which are: higher-order contributions for the hard and soft photon emission of the up quark ($Q_u$ and the $\xi(E_\gamma)$-term, respectively); the photon emission by the $b$ quark, which is suppressed due to its higher mass ($Q_b$-term); and the photon emission by the lepton, which is only present in the axial form factor ($Q_\ell$-term). The radiative corrections contained in $R(E_\gamma, \mu)$ reduce the leading-order amplitude by about 20-25%. The remaining $1/m_b$ power corrections have considerable parametric uncertainties. However, using central values for the parameters, the power-suppressed terms reduce the decay amplitude by about half the amount of the radiative corrections. The soft correction for the light quark $\xi(E_\gamma)$ constitutes the largest uncertainty in the form factors and it has been calculated in Ref. to a higher precision.

The most stringent limits for the decay process have been reported by the BaBar collaboration [5] at 90% confidence level with $\mathcal{B}(B^+ \to e^+\nu_e\gamma) < 17 \times 10^{-6}$, $\mathcal{B}(B^+ \to \mu^+\nu_\mu\gamma) < 26 \times 10^{-6}$, $\mathcal{B}(B^+ \to \ell^+\nu_\ell\gamma) < 15.6 \times 10^{-6}$, and a partial branching fraction $\Delta\mathcal{B}(B^+ \to \ell^+\nu_\ell\gamma) < 14 \times 10^{-6}$ for photons with energies higher than 1 GeV. For the preferred value of $\lambda_B \approx 200$ MeV, a Standard Model branching fraction of $\mathcal{B}(B^+ \to \ell^+\nu_\ell\gamma) \approx O(10^{-6})$ is expected [5].

II. DATA SAMPLE AND SIMULATION

This study uses a sample of $(771.6 \pm 10.6) \times 10^6$ $B\bar{B}$ pairs, which corresponds to an integrated luminosity of 711 fb$^{-1}$ collected with the Belle detector at the KEKB asymmetric-energy $e^+e^-$ collider [7]. The collider operates at the $\Upsilon(4S)$ resonance with a center-of-mass energy of 10.58 GeV, where the resonance decays almost exclusively to $B\bar{B}$ pairs.

The Belle detector is a large-solid-angle magnetic spectrometer that consists of a silicon vertex detector, a 50-layer central drift chamber (CDC), an array of aerogel threshold Cherenkov counters (ACC), a barrel-like arrangement of time-of-flight scintillation counters (TOF), and an electromagnetic calorimeter (ECL) comprising CsI crystals located inside a superconducting solenoid coil that provides a 1.5 T magnetic field. An iron flux return located outside the coil is instrumented to detect $K_L^0$ mesons and to identify muons (KLM). A detailed description of the Belle detector can be found in Ref. [5].

The analysis procedure is determined using Monte Carlo (MC) samples that are simulated with the EvtGen software package [9] followed by detector simulation performed with GEANT3 [10]. Beam background is recorded by the experiment and added to each event in the simulated MC. Samples of $2 \times 10^6$ events are generated for each signal MC channel, where the latest theoretical calculation [3] is implemented as a decay model in EvtGen. Different samples with high integrated luminosity are used to estimate the background. An MC sample containing resonant charmed $B\bar{B}$ events with $b \to c$ decays corresponds to ten times the integrated luminosity of the data sample. Non-resonant $e^+e^- \to q\bar{q}$ ($q = u, d, s, c$) continuum processes are included in an MC sample with six times the integrated luminosity of the data sample. A semileptonic $b \to u\ell^+\nu_\ell$ sample with 20 times the statistics of the data contains the important background processes of $B^+ \to \pi^0\ell^+\nu_\ell$ and $B^+ \to \eta\ell^+\nu_\ell$. For the latter two processes, high-statistics MC is produced with about 100 times the size of the data sample. A final sample contains rare $b \to s$ transitions and additional processes with 50 times the integrated data luminosity.

III. HADRONIC B-TAGGING

As the neutrino of the signal decay cannot be detected, the full reconstruction technique provides strong constraints on the kinematics of the signal decay. The hadronic full reconstruction at Belle is a hierarchical reconstruction scheme of one of the two $B$ mesons (tag-side $B_{\rm tag}$ meson) [11] in the event.

The charged $B_{\rm tag}$ meson candidate is reconstructed in one of 17 final states: $\bar{D}^{(*)0} X_{\rm had}$ (7 states), $D_s^{(*)+}\bar{D}^{(*)0}$ (4 states), $\bar{D}^0 K^+$, $D^-\pi^+\pi^+$, $J/\psi K^+ (\pi^0, \pi^+\pi^-)$, and $J/\psi K_S^0\pi^+$, where $X_{\rm had}$ is a set of selected states with one to four pions, of which one can be neutral. The $J/\psi$ particles are reconstructed from $e^+e^-$ or $\mu^+\mu^-$ decays.


Two charged tracks are used to reconstruct a $K_S^0$ candidate whose mass must be within a 30 MeV/$c^2$ window around the nominal $K_S^0$ mass. Neutral pions are reconstructed from pairs of photons, each with an energy of at least 30 MeV and an invariant mass within 19 MeV/$c^2$ of the nominal pion mass. Photons are identified as energy depositions in the calorimeter above 20 MeV without an associated track. Charged tracks are identified as pions or kaons using a likelihood ratio constructed from CDC, ACC, and TOF information. Charged-track quality is improved by requiring that $|dz| < 4.0$ cm and $dr < 2.0$ cm, where $|dz|$ and $dr$ are the distances of closest approach of the track to the interaction point along the beam axis and in the transverse plane, respectively.

The efficiency of the $B_{\rm tag}$ full reconstruction depends on the complexity of the decay of the signal-side $B$ meson. For the simple $B^+ \to \ell^+\nu_\ell\gamma$ process, a relatively high efficiency of 0.6% is found in the signal MC for correctly reconstructed $B_{\rm tag}$ candidates; for $b \to c$ processes, the efficiency lies around 0.2%.

The full reconstruction contains a separate neural network (NN) for each particle type and decay mode and is trained with the NeuroBayes software [12]. Important input variables for the NN output of the final $B_{\rm tag}$ meson include: the network outputs of the daughter particles; the reconstructed masses of the daughters; $\Delta E = E_{B_{\rm tag}} - E_{\rm beam}$, which is the difference between the $B_{\rm tag}$ candidate energy and the beam energy in the center-of-mass system (CMS); the mass difference between $M(D^*)$ and $M(D)$; the angles between the daughters in the $B_{\rm tag}$ meson rest frame; the momentum of the daughters in the lab frame; and $\cos\theta_B$, the cosine of the angle between the beam and the $B_{\rm tag}$ direction. The network output can be interpreted as the probability that the $B_{\rm tag}$ candidate is correctly reconstructed, which means all particle hypotheses of the decay chain are correct. In the case of multiple $B_{\rm tag}$ candidates, the candidate with the highest network output is selected.

For the network output, differences between data and MC have been observed [13]; $B_{\rm tag}$ decay modes with at least two pions in the final state show the largest deviation. In charmed semileptonic signal-side $B$ decays the efficiency in MC is overestimated by approximately one third. From that, a correction factor depending on the hadronic tag-side decay channel is obtained, and it is applied to all MC samples used in the analysis.

For the analysis, additional event shape variables are added to the network training. The variables are used to discriminate between spherical $B\bar{B}$ and jet-like $q\bar{q}$ continuum processes. The event shape variables are modified Fox-Wolfram moments [14] and the thrust axis of the $B_{\rm tag}$ meson candidate.

IV. SELECTION

A. Missing mass

With the $B_{\rm tag}$ candidate three-momentum $\vec{p}_{B_{\rm tag}}$, the four-momentum of the signal-side $B_{\rm sig}$ meson in the CMS is given by $p_{B_{\rm sig}} = (E_{\rm beam}/c, -\vec{p}_{B_{\rm tag}})$. This makes use of the two-body decay kinematics of the $\Upsilon(4S)$ and the measured CMS boost of the $B\bar{B}$ system. The $B_{\rm sig}$ four-momentum is used to compute the squared missing mass, which is the strongest discriminator between signal and background. The variable is defined as

$$m_{\rm miss}^2 = (p_{B_{\rm sig}} - p_\ell - p_\gamma)^2 / c^2,$$

where the four-momenta of the daughter lepton and photon are subtracted from that of the $B_{\rm sig}$ candidate. For correctly reconstructed signal events, the variable corresponds to the neutrino mass and therefore peaks around zero. The resolution of this signal peak is improved by using $E_{\rm beam}$ instead of $E_{B_{\rm tag}}$ for $p_{B_{\rm sig}}$. An additional improvement in resolution is achieved for $B^+ \to e^+\nu_e\gamma$ decays by taking bremsstrahlung into account: the four-momentum of the signal electron candidate is corrected by adding up to one photon with an energy below 1 GeV within a five-degree cone around the direction of its momentum. For the signal extraction, the region with $m_{\rm miss}^2 \in (-2.0, 4.0)$ GeV$^2$/$c^4$ around the signal peak is used.
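
The definition of the squared missing mass can be illustrated with a short numerical sketch. It is not the analysis code: the function names and the tuple layout (E, px, py, pz) are assumptions made for illustration, natural units with c = 1 are used, and all quantities are taken in the CMS.

```python
def minkowski_sq(p):
    """Return E^2 - |p|^2 for a four-vector (E, px, py, pz) with c = 1."""
    e, px, py, pz = p
    return e * e - (px * px + py * py + pz * pz)


def missing_mass_squared(e_beam, p_btag, p_lepton, p_gamma):
    """Squared missing mass in GeV^2 (c = 1).

    e_beam   : beam energy in the CMS, used as the B_sig energy
    p_btag   : B_tag three-momentum (px, py, pz) in the CMS
    p_lepton : lepton four-momentum (E, px, py, pz) in the CMS
    p_gamma  : photon four-momentum (E, px, py, pz) in the CMS
    """
    # The B_sig recoils against the B_tag, so its momentum is -p_btag,
    # and its energy is fixed to the beam energy.
    p_bsig = (e_beam, -p_btag[0], -p_btag[1], -p_btag[2])
    p_miss = tuple(b - l - g for b, l, g in zip(p_bsig, p_lepton, p_gamma))
    return minkowski_sq(p_miss)
```

For a correctly reconstructed signal event with a massless neutrino, the returned value is compatible with zero up to detector resolution.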

The analysis begins with a selection with high signal efficiency and purity, followed by a signal-yield extraction with a fit to the missing mass in bins of an NN output. The number of network-output bins as well as the selection of variables used in the training of the network are optimized for signal significance. With the exception of the lepton identification (ID), the selection is identical for both $B^+ \to e^+\nu_e\gamma$ and $B^+ \to \mu^+\nu_\mu\gamma$.

B. Tag-side selection

For the $B_{\rm tag}$ candidate, the beam-energy-constrained mass $M_{\rm bc} = \sqrt{E_{\rm beam}^2/c^4 - \vec{p}_{B_{\rm tag}}^{\,2}/c^2}$ is required to be greater than 5.27 GeV/$c^2$. A selection of $\Delta E \in (-0.15, 0.10)$ GeV is applied; this variable is not used elsewhere since it is strongly correlated with the missing mass. A loose selection on the network output of the fully reconstructed $B_{\rm tag}$ meson is chosen
to have a probability above 2 x 10“^ of being correctly 
reconstructed. 
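
A minimal sketch of the two tag-side kinematic requirements is given below. The function names are hypothetical, units are GeV with c = 1, and the network-output requirement is deliberately left out.

```python
import math


def beam_constrained_mass(e_beam, p_btag):
    """M_bc = sqrt(E_beam^2 - |p_Btag|^2) with c = 1, all in the CMS."""
    p2 = sum(c * c for c in p_btag)
    # Guard against unphysical inputs where |p| exceeds the beam energy.
    return math.sqrt(max(e_beam * e_beam - p2, 0.0))


def delta_e(e_btag, e_beam):
    """Difference between the B_tag candidate energy and the beam energy."""
    return e_btag - e_beam


def passes_tag_kinematics(e_beam, e_btag, p_btag):
    """Apply M_bc > 5.27 GeV and -0.15 < Delta E < 0.10 GeV."""
    m_bc = beam_constrained_mass(e_beam, p_btag)
    d_e = delta_e(e_btag, e_beam)
    return m_bc > 5.27 and -0.15 < d_e < 0.10
```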


C. Signal-side selection 

After hadronic tag-side reconstruction, one charged track and one high-energy photon are expected in the detector. No additional charged tracks beyond the signal's lepton daughter are permitted. The signal-side charged-track selection demands the same selection for the impact parameters as the tag-side: $dr < 2$ cm and $|dz| < 4$ cm. The charge of the signal lepton candidate is required to be opposite that of the $B_{\rm tag}$. Curling tracks, which can be counted twice, are taken into account on the signal side by counting two tracks as one if the cosine of the angle between them is above 0.999 and their transverse momentum differs by less than 30 MeV/$c$.
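
The curling-track criterion can be written as a small helper. This is a sketch under the assumption that tracks are given as three-momenta (px, py, pz) in GeV/c with the z axis along the beam; the function names are hypothetical.

```python
import math


def is_curler_pair(p1, p2, cos_min=0.999, dpt_max=0.030):
    """True if two tracks look like one curling track counted twice."""
    n1 = math.sqrt(sum(a * a for a in p1))
    n2 = math.sqrt(sum(a * a for a in p2))
    if n1 == 0.0 or n2 == 0.0:
        return False
    cos_angle = sum(a * b for a, b in zip(p1, p2)) / (n1 * n2)
    pt1 = math.hypot(p1[0], p1[1])
    pt2 = math.hypot(p2[0], p2[1])
    return cos_angle > cos_min and abs(pt1 - pt2) < dpt_max


def count_tracks(momenta):
    """Count tracks, treating each curler pair as a single track."""
    kept = []
    for p in momenta:
        if not any(is_curler_pair(p, q) for q in kept):
            kept.append(p)
    return len(kept)
```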

Electrons are identified from a likelihood formed with information from multiple detectors: the energy loss in the CDC; the ratio of energy deposition in the ECL to the track momentum; the shower shape in the ECL; the matching of the charged track to the shower position in the ECL; and the photon yield in the ACC [15]. Muons are identified from charged tracks extrapolated to the outer detector; the difference between the expected and measured penetration depth of the track as well as the transverse deviation of KLM hits from the extrapolated track are used to distinguish muons from hadrons [16]. Adding the particle ID to the final selection, 95% (99%) of events with a wrong-lepton hypothesis are vetoed with a reduction in signal selection efficiency of about 2% (1.2%) for the muon (electron) channel.

The analysis is performed with two energy thresholds of 1 GeV and 400 MeV for the signal photon candidate in the $B_{\rm sig}$ rest frame, where the most energetic photon in the $B_{\rm sig}$ rest frame is identified as the signal photon candidate. The 1 GeV threshold is a lower bound for which the theoretical model is valid; however, a secondary analysis covering a larger phase space is performed, with a 400 MeV bound chosen to remove the divergent part in the decay model at lower energies. The missing momentum in the event $|\vec{p}_{\rm miss}|$ has to be above 800 MeV/$c$ in the $B_{\rm sig}$ rest frame, to be consistent with the presence of a high-energy neutrino.

Events in which a signal photon candidate is misreconstructed from bremsstrahlung radiation originating from the signal electron are vetoed by requiring that the cosine of the angle between the lepton and photon candidates in the $B_{\rm sig}$ rest frame ($\cos\theta_{\gamma\ell}$) lie below 0.6. For the cosine of the angle between the missing momentum and the signal photon candidate in the $B_{\rm sig}$ rest frame ($\cos\theta_{\gamma\nu}$) a discrepancy is observed between MC and data for values below $-0.9$ in the sideband of $M_{\rm bc} < 5.27$ GeV/$c^2$; therefore $\cos\theta_{\gamma\nu}$ is selected to be larger than $-0.9$. The remaining energy in the ECL is the summed energy of clusters not associated with signal or tag-side particles and is required to be below 900 MeV. Here, clusters are required to have energies above 50, 100, and 150 MeV for the barrel, forward, and backward end-cap calorimeter, respectively. These direction-dependent energy thresholds have been shown to veto detector background not related to physical processes.

To suppress the main background of $B^+ \to \pi^0\ell^+\nu_\ell$ decays, a $\pi^0$ veto is constructed that combines the signal photon candidate with all remaining photons in the ECL above an energy of 100 MeV to compute the invariant mass, where only the candidate closest to the nominal $\pi^0$ mass is kept. A $\pi^0$ mass is only computed if at least one remaining photon above an energy of 100 MeV is left in the ECL. The number of events with a computed mass decreases with a rising energy threshold, as does the number of events vetoed by a selection on the resulting mass spectrum. On the other hand, an increasing energy threshold improves the signal and background separation since fewer photons are combined with the signal photon candidate. This reduces the possibility of calculating a $\pi^0$ mass close to the correct one by chance. The 100 MeV threshold is chosen to ensure a high signal efficiency of about 99% while achieving a good background rejection of 45% for $B^+ \to \pi^0\ell^+\nu_\ell$ processes, when a window of 30 MeV/$c^2$ around the nominal $\pi^0$ mass is vetoed.

FIG. 1. (color online) Distributions of $m_{\rm miss}^2$ on data (points with error bars) in bins of the network output for (a) the electron channel and (b) the muon channel. The PDFs are for signal (solid blue), enhanced signal (dashed violet), fixed $B \to X_u\ell^+\nu_\ell$ backgrounds (dash-dotted green), fitted backgrounds (dotted red), and the sum (solid black). The enhanced signal function, which has the same normalization for each bin, corresponds to a branching fraction of $30 \times 10^{-6}$. The most signal-like bin is found in the upper left panel. Proceeding from left to right, the distributions become increasingly more background-like and the most background-like bin is shown in the lower right panel.

FIG. 2. (color online) Unbinned $m_{\rm miss}^2$ distribution where the enhanced signal corresponds to a branching fraction of $30 \times 10^{-6}$.

FIG. 3. (color online) Network outputs used for binning where the bin boundaries are indicated by the dashed lines. The normalizations of the MC distributions are taken from the fit results in $m_{\rm miss}^2$ and the enhanced signal corresponds to a branching fraction of $30 \times 10^{-6}$. [Surviving panel title: (b) Muon channel; horizontal axis: Network output.]
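
The veto logic can be sketched as follows. The code is illustrative only: photons are assumed to be four-vectors (E, px, py, pz) in GeV with c = 1, the nominal $\pi^0$ mass is taken from the world average rather than from the text, and the 30 MeV/$c^2$ window is assumed to be a full width centered on the nominal mass, which the text does not state explicitly.

```python
import math

PI0_MASS = 0.1349768  # GeV/c^2; world-average value, assumed here


def diphoton_mass(g1, g2):
    """Invariant mass of two photon four-vectors (E, px, py, pz)."""
    e = g1[0] + g2[0]
    px, py, pz = g1[1] + g2[1], g1[2] + g2[2], g1[3] + g2[3]
    return math.sqrt(max(e * e - (px * px + py * py + pz * pz), 0.0))


def closest_pi0_mass(signal_photon, other_photons, e_min=0.100):
    """Mass of the pairing closest to the pi0 mass, or None if no photon
    above the energy threshold remains in the event."""
    masses = [diphoton_mass(signal_photon, g)
              for g in other_photons if g[0] > e_min]
    if not masses:
        return None
    return min(masses, key=lambda m: abs(m - PI0_MASS))


def is_vetoed(signal_photon, other_photons, e_min=0.100, window=0.030):
    """Veto the event if the best pairing falls inside the mass window."""
    mass = closest_pi0_mass(signal_photon, other_photons, e_min)
    return mass is not None and abs(mass - PI0_MASS) < 0.5 * window
```

Raising `e_min` reduces the number of pairings, which mirrors the trade-off between signal efficiency and background rejection described above.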

The overall signal selection efficiency after full reconstruction is 47% (45%) for the muon (electron) channel. The expected event numbers from the background MC samples are: 328 (299) for $b \to c$ decays, 78 (76) for $b \to u\ell^+\nu_\ell$ decays, and 17 (6) events from non-resonant $q\bar{q}$ ($q = u, d, s, c$) processes for the muon (electron) channel. The contribution from $b \to s$ processes is found to be negligible.


D. Neural network training 

To further optimize the signal selection, another NN is formed with the NeuroBayes package [12]. This software computes each input variable's significance from the training; this is used to retain only the most significant variables in the network. The variables included in the training are: the extra energy in the ECL, $\cos\theta_{\gamma\ell}$, and $\cos\theta_{\gamma\nu}$. To further separate the main background processes of $B^+ \to \pi^0\ell^+\nu_\ell$ and $B^+ \to \eta\ell^+\nu_\ell$, where the $\pi^0$ and $\eta$ decay into two photons and one of the photons is misidentified as the signal photon, meson-veto variables are incorporated into the network. These are computed in the same way as for the selection above but with different energy thresholds on the remaining photons in the ECL.

The thresholds are increased in 10 MeV steps from 20 to 100 MeV. The number of photons combined with the signal photon candidate depends on this energy threshold, and since only the combination closest to the nominal mass is taken into account, different photon combinations end up in the mass spectrum. This leads to different invariant mass spectra with complementary information. The $\eta$ invariant mass is computed in the same way, with energy thresholds between 20 and 300 MeV. Only the six most significant meson masses are retained in the training.

Signal MC samples of both signal channels are trained simultaneously against the $b \to u\ell^+\nu_\ell$ MC and the high-luminosity MC sample of the $B^+ \to \pi^0\ell^+\nu_\ell$ and $B^+ \to \eta\ell^+\nu_\ell$ processes. For the secondary analysis with $E_\gamma^{\rm sig} > 400$ MeV, the angles $\cos\theta_{\gamma\ell}$ and $\cos\theta_{\gamma\nu}$ are excluded from the training to reduce the signal model dependence of the result.

V. SIGNAL EXTRACTION 

A. Fit model 

The signal yield is determined by an extended unbinned maximum likelihood fit to the $m_{\rm miss}^2$ distribution in six bins of the NN output. The likelihood function is given by

$$\ln\mathcal{L} = \sum_{j=1}^{N_{\rm tot}} \ln\Big\{ \sum_{i}^{N_c} N_i\, \mathcal{P}_i(m_{{\rm miss},j}^2, n_{\rm out}) \Big\} - \sum_{i}^{N_c} N_i,$$

where $N_{\rm tot}$ is the total number of events in the data set, $N_c$ denotes the number of components in the fit, $N_i$ is the number of events for the $i$-th component, and $\mathcal{P}_i$ represents the probability density function (PDF) for that component as a function of $m_{\rm miss}^2$ and the network output $n_{\rm out}$.
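
The extended log-likelihood can be evaluated directly from its definition. The sketch below is a plain-Python illustration and not the fitting code of the analysis; the function signature, with yields as a list and PDFs as callables of $(m_{\rm miss}^2, n_{\rm out})$, is an assumption.

```python
import math


def extended_log_likelihood(yields, pdfs, data):
    """ln L = sum_j ln( sum_i N_i * P_i(m2_j, nout_j) ) - sum_i N_i.

    yields : list of component yields N_i (negative values are allowed)
    pdfs   : list of callables P_i(m2, n_out), each normalized to one
    data   : list of (m2, n_out) pairs, one per event
    """
    log_l = 0.0
    for m2, n_out in data:
        density = sum(n * pdf(m2, n_out) for n, pdf in zip(yields, pdfs))
        if density <= 0.0:
            # The likelihood is undefined where the total density is not positive.
            return float("-inf")
        log_l += math.log(density)
    return log_l - sum(yields)
```

For a single component the function is maximized when the yield equals the number of events, which is the defining property of the extended term.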

The fit model consists of three components: $B^+ \to \ell^+\nu_\ell\gamma$ signal; measured $b \to u\ell^+\nu_\ell$ decays, referred to hereinafter as the $B \to X_u\ell^+\nu_\ell$ component; and a component denoted as "fitted background" that includes unmeasured $b \to u\ell^+\nu_\ell$ contributions, resonant $b \to c$ decays, and non-resonant $q\bar{q}$ processes. In the fit to data, the expected yield of the $B \to X_u\ell^+\nu_\ell$ component containing the known decay modes with $X_u = \pi^0$, $\eta$, $\omega$, $\pi^+$, and $\eta'$ is fixed according to the world average values of the branching fractions [17]. The shapes of the three components are determined from MC in each network output bin separately and fixed in the fit to data together with the relative normalizations among the bins. The PDF for the $i$-th component is given by

$$\mathcal{P}_i(m_{\rm miss}^2, n_{\rm out}) = f_i^{n_{\rm out}}\, \mathcal{P}_i^{n_{\rm out}}(m_{\rm miss}^2),$$

where $f_i^{n_{\rm out}}$ denotes the fixed fraction of $N_i$ events in the bin and $\mathcal{P}_i^{n_{\rm out}}$ is the PDF in that NN bin with central value $n_{\rm out}$.

By design, each bin contains the same number of expected signal events and the bin boundaries are shown in Fig. 3. The number of network output bins is chosen to maximize the expected significance of the signal, which is determined in toy MC studies. The number of signal and fitted background events are the two free parameters of the fit model. The two signal channels $B^+ \to e^+\nu_e\gamma$ and $B^+ \to \mu^+\nu_\mu\gamma$ are measured in separate fits. A simultaneous fit to both channels is performed to measure the $B^+ \to \ell^+\nu_\ell\gamma$ branching fraction. Lepton universality is assumed for the latter measurement, where the signal branching fractions of the two channels are fixed to the same value. To avoid a fit bias, all yields are unconstrained and negative values are allowed in the fit.
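
Bin boundaries that give equal expected signal content can be obtained from quantiles of the signal MC network output. The following is a minimal sketch of that idea; the function name and the choice of taking the outermost edges from the sample itself are assumptions.

```python
def equal_signal_bin_edges(signal_outputs, n_bins=6):
    """Return n_bins + 1 edges so that each bin holds about the same
    number of signal MC events (quantile binning)."""
    ordered = sorted(signal_outputs)
    n = len(ordered)
    edges = [ordered[0]]
    for k in range(1, n_bins):
        # The k-th inner edge sits at the k/n_bins quantile of the sample.
        edges.append(ordered[(k * n) // n_bins])
    edges.append(ordered[-1])
    return edges
```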

The signal component is parametrized with the sum of a Crystal Ball function [18] and a Gaussian with a common mean. A shape for the fitted background component is given by an exponential with a polynomial in its argument

f(x; x_0, \alpha, \beta) =

The fixed background component of $B \to X_u\ell^+\nu_\ell$ decays is modeled with a non-parametric PDF using a kernel estimation algorithm [19], where each data point is represented by a Gaussian and their sum yields a probability density function. The width of the Gaussian kernels is a parameter of the algorithm that is chosen to produce a smooth description of the MC. Identical functions are fitted for both signal channels.
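
The kernel estimation described above can be sketched with a fixed-width Gaussian kernel. This is a generic illustration, not the algorithm of Ref. [19], which may use adaptive widths; the function name and the single `bandwidth` parameter are assumptions.

```python
import math


def gaussian_kde(points, bandwidth):
    """Return a PDF built as the normalized sum of Gaussians, one per point."""
    norm = 1.0 / (len(points) * bandwidth * math.sqrt(2.0 * math.pi))

    def pdf(x):
        return norm * sum(
            math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in points
        )

    return pdf
```

A larger `bandwidth` gives a smoother but less detailed description of the MC sample, which is the parameter varied later for the systematic uncertainty on the shape.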


B. Significance and limit determination 

The significance of the signal is defined as $\sqrt{-2\ln(\mathcal{L}_b/\mathcal{L}_{s+b})}$, where $\mathcal{L}_b$ and $\mathcal{L}_{s+b}$ are the maximum likelihood values of the background and signal-plus-background model, respectively. The maximum likelihood values for the null and signal hypotheses are obtained from the likelihood profile, where both likelihood values are taken from the same data distribution. An upper limit at 90% credibility level [1] is determined from an integration of the likelihood function up to the 90% quantile, where only the range for positive signal yields is used. The systematic uncertainty is included by convolving the likelihood function with a Gaussian whose width is equal to the systematic error. Systematic errors affecting only the signal yield are included in the determination of the significance. The total systematic error, including errors impacting the overall yield, is used for the measurement of the branching fraction and its upper limit. Since the systematic errors are asymmetric, the downward errors are used for the significance and the upward errors for the upper limit. The expected fit results from an average over many toy MC studies are listed in Table I for the nominal and secondary analyses. The expected signal yield depends on the value of $\lambda_B$. The expected fit significances are determined with a signal branching fraction of $5 \times 10^{-6}$ and the expected upper limits are measured without any signal contribution. For the simultaneous fit, a significance of $2.9\sigma$ including systematic errors is expected.
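
The limit-setting procedure can be illustrated numerically. The sketch assumes the likelihood has been scanned on a uniform, ascending grid of signal yields or branching fractions, and uses a single symmetric Gaussian width in place of the asymmetric systematic errors; both function names are hypothetical.

```python
import math


def smear_likelihood(grid, likelihood, sigma_syst):
    """Convolve scanned likelihood values with a Gaussian of width sigma_syst.

    The overall normalization is dropped because the limit only depends
    on ratios of integrals.
    """
    if sigma_syst <= 0.0:
        return list(likelihood)
    smeared = []
    for x in grid:
        smeared.append(sum(
            l * math.exp(-0.5 * ((x - y) / sigma_syst) ** 2)
            for y, l in zip(grid, likelihood)
        ))
    return smeared


def upper_limit(grid, likelihood, credibility=0.90):
    """Smallest grid value s >= 0 below which the requested fraction of the
    likelihood integral over the positive range is contained."""
    positive = [(s, l) for s, l in zip(grid, likelihood) if s >= 0.0]
    total = sum(l for _, l in positive)
    running = 0.0
    for s, l in positive:
        running += l
        if running >= credibility * total:
            return s
    return positive[-1][0]
```

Restricting the integral to positive values pushes the quantile upward whenever the likelihood peaks near or below zero, which is why the resulting limit is conservative.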




Nominal analysis with $E_\gamma^{\rm sig} > 1$ GeV

Mode | MC: Yield | MC: Significance ($\sigma$) | MC: $\mathcal{B}$ limit ($10^{-6}$) | Data: Yield | Data: $\mathcal{B}$ ($10^{-6}$) | Data: Significance ($\sigma$) | Data: $\mathcal{B}$ limit ($10^{-6}$)
$B^+ \to e^+\nu_e\gamma$ | 8.0 ± 4.5 … | 2.1 | < 7.5 | … +4.9/-3.9 +1.0/-1.3 | … +3.0/-2.4 +0.7/-0.9 | 1.7 | < 6.1
$B^+ \to \mu^+\nu_\mu\gamma$ | 8.7 ± 4.6 | 2.2 | < 6.9 | … +3.6/-2.6 +1.0/-1.5 | … +2.1/… +0.7/… | 0.4 | < 3.4
$B^+ \to \ell^+\nu_\ell\gamma$ | 16.5 ± 6.5 … | 2.9 | < 4.8 | … +5.7/-4.7 +1.6/-2.2 | … +1.7/… +0.6/-0.7 | 1.4 | < 3.5

Secondary analysis with $E_\gamma^{\rm sig} > 400$ MeV

Mode | MC: Yield | MC: Significance ($\sigma$) | MC: $\mathcal{B}$ limit ($10^{-6}$) | Data: Yield | Data: $\mathcal{B}$ ($10^{-6}$) | Data: Significance ($\sigma$) | Data: $\mathcal{B}$ limit ($10^{-6}$)
$B^+ \to e^+\nu_e\gamma$ | 12.4 ± 6.2 | 2.1 | < 6.8 | … +7.0/-6.0 +1.8/-2.3 | … +2.9/-2.5 +0.8/-1.0 | 2.0 | < 9.3
$B^+ \to \mu^+\nu_\mu\gamma$ | 11.9 ± 6.0 … | 2.2 | < 6.2 | … +5.2/-4.1 +1.7/-2.1 | … | … | < 4.3
$B^+ \to \ell^+\nu_\ell\gamma$ | 24.9 ± 8.7 … | 2.9 | < 4.3 | … +8.4/-7.4 +3.0/-3.5 | … +1.7/-1.5 +0.7/-0.8 | 1.4 | < 5.1

TABLE I. Expected signal yields obtained from MC for $\mathcal{B}(B^+ \to \ell^+\nu_\ell\gamma) = 5 \times 10^{-6}$ and measured signal yields on data, where the first error is statistical and the second error systematic. The significances and credibility levels contain systematic errors. The credibility levels are given at 90% where the expected MC limit is determined without signal.



Mode | Nominal ($E_\gamma^{\rm sig} > 1$ GeV): MC expectation | Nominal: Measured yield | Secondary ($E_\gamma^{\rm sig} > 400$ MeV): MC expectation | Secondary: Measured yield
$B^+ \to e^+\nu_e\gamma$ | 315 ± 4.2 | 336 … | 668 ± 6.1 | 739 …
$B^+ \to \mu^+\nu_\mu\gamma$ | 348 ± 4.5 | 352 … | 714 ± 6.4 | 759 …

TABLE II. Fitted background yields compared to the MC prediction with statistical errors only.


C. Toy MC and sideband data checks 

The fit model is checked for a bias in extended toy MC studies, where pull distributions are used to quantify the size of the bias. The pull distributions are computed from the deviation from the true value divided by the fit error and have a standard normal distribution for unbiased fits. This is used in a linearity test of the signal yield, which checks whether the bias of the fit results depends on the signal branching fractions. The pull distributions are in agreement with standard normal distributions, indicating no bias for branching fractions that result in a significant measurement. A test of the credible interval [1] coverage counts the number of toy measurements for which the true value is contained inside the 90% interval. For a branching fraction of $5 \times 10^{-6}$, 95% of the true values are contained inside the interval; this number increases to more than 99% below a branching fraction of $3 \times 10^{-6}$. Since the likelihood is only integrated for positive signal yields to determine the limit, the 90% quantile is moved to higher values. Therefore, the upper limit is a conservative measure. The same results are found for the secondary analysis.
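
The pull and coverage checks reduce to a few lines of arithmetic. The sketch below assumes the toy fit results are already available as plain numbers; the function names are hypothetical.

```python
import math


def pull(fitted, true_value, fit_error):
    """Deviation of the fitted value from the truth in units of the fit error."""
    return (fitted - true_value) / fit_error


def pull_summary(pulls):
    """Mean and sample standard deviation; an unbiased fit with correct
    errors gives values compatible with 0 and 1."""
    n = len(pulls)
    mean = sum(pulls) / n
    variance = sum((p - mean) ** 2 for p in pulls) / (n - 1)
    return mean, math.sqrt(variance)


def coverage(intervals, true_value):
    """Fraction of toy intervals (low, high) that contain the true value."""
    inside = sum(1 for low, high in intervals if low <= true_value <= high)
    return inside / len(intervals)
```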

The background MC shapes are compared to data in the $M_{\rm bc} < 5.27$ GeV/$c^2$ sideband. Additionally, the agreement of the input variables to the NN is checked in a $B \to X_u\ell^+\nu_\ell$ enhanced region of $m_{\rm miss}^2 \in (0.3, 1.0)$ GeV$^2$/$c^4$ and a generic background dominated region of $m_{\rm miss}^2 \in (1.0, 4.0)$ GeV$^2$/$c^4$. All considered distributions agree between data and MC, except for the previously mentioned discrepancy in the $\cos\theta_{\gamma\nu}$ distribution.


VI. MEASUREMENT 

The fit results are listed in Table I and the $m_{\rm miss}^2$ distributions for the nominal analysis are shown in Fig. 1 for both signal channels. No significant signal is found in any of the fits. To offer a better overview of the fit results, unbinned distributions of the results are shown in Fig. 2. Good agreement between data and MC for the network output is shown in Fig. 3. The fitted background yields in the data are in agreement with the MC prediction, as shown in Table II. Assuming that only a few signal events are found below the photon energy threshold of 400 MeV, the partial branching fractions of the secondary analysis can be compared to the BaBar measurement [5] for the whole energy range. Limits on $\lambda_B$ are computed by integrating the differential decay width from Equation 1,

$$\int_{1\,{\rm GeV}}^{m_B c^2/2} \frac{d\Gamma}{dE_\gamma}\, dE_\gamma,$$

and solving for $\lambda_B$, where the integral includes the partial phase space $E_\gamma > 1$ GeV up to half of the $B$ meson mass. The input parameters for the differential decay width are taken from Ref. [3] and the value for the soft correction $\xi(E_\gamma)$ is taken from Ref. [5]. All parameters are varied by their uncertainties to obtain parameter combinations yielding minimal and maximal values for $\lambda_B$. With the $B^+ \to \ell^+\nu_\ell\gamma$ limit of the nominal analysis, a central value $\lambda_B > 238$ MeV is obtained at 90% credibility level. The limit changes within a range of $\lambda_B > (172, 410)$ MeV with varying input parameters.^1 Similar values are obtained for the secondary analysis.

Source | $B^+ \to \mu^+\nu_\mu\gamma$ | $B^+ \to e^+\nu_e\gamma$
Fit shapes | +0.75/-1.34 | +0.64/-1.06
Meson veto network | ±0.58 | ±0.66
Fixed $B \to X_u\ell^+\nu_\ell$ yield | ±0.18 | ±0.24
$B^+ \to \ell^+\nu_\ell\gamma$ model | -0.01 | -0.05
Additive Error | +0.97/-1.47 | +0.95/-1.27
Lepton ID | ±0.42 | ±0.18
Tag-side efficiency | ±0.35 | ±0.34
Tag-side NN | ±0.13 | ±0.40
Tracking efficiency | -0.01 | -0.01
$N_{B\bar{B}}$ | ±0.11 | ±0.11
Multiplicative Error | ±0.57 | ±0.55
Combined Error | +1.12/-1.58 | +1.10/-1.39

Source | $B^+ \to \ell^+\nu_\ell\gamma$
Additive Error | +1.64/-2.15
Multiplicative Error | ±0.99
Combined Error | +1.92/-2.37

TABLE III. Systematic uncertainties on the signal yield grouped by error-types for the nominal analysis with $E_\gamma^{\rm sig} > 1$ GeV. Deviations are given in signal yields.


VII. SYSTEMATIC UNCERTAINTIES 

Systematic errors are estimated in toy MC studies where the default and the varied fit models are applied to the same toy sample and the difference in signal yield is taken as a systematic deviation averaged over many toy measurements. The results are shown in Table III for the nominal analysis.

The largest error is given by the variation of the fit shapes, where the $1\sigma$ fit error from MC is varied. For the non-analytical shape obtained from the kernel estimator algorithm, the size of the Gaussian kernels is varied to obtain a considerable shape variation.

The systematic error on the meson-veto network is obtained from the control channel $B \to K^*\gamma$. Here, the signal photon candidate is combined with the remaining photon candidates to compute the meson mass spectra and obtain the network output distribution. From this distribution, a double ratio of data and MC is calculated as $(N_i^{\rm data}/N_{\rm sum}^{\rm data}) / (N_i^{\rm MC}/N_{\rm sum}^{\rm MC})$, where $N_i$ is the event count in the $i$-th bin and $N_{\rm sum}$ the total number of events. The largest deviation between data and MC is found to be 8% in the most background-like network output bin. An alternate model is obtained by using the double ratio values to reweight the binned distribution in $B^+ \to \ell^+\nu_\ell\gamma$. The angles $\cos\theta_{\gamma\ell}$ and $\cos\theta_{\gamma\nu}$, as well as the remaining energy in the ECL, cannot be used in the NN trained on the control sample. Therefore, a separate network without these variables is trained on the $B^+ \to \ell^+\nu_\ell\gamma$ samples, which is then used to obtain the double ratios in the control channel.
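
The double ratio and the reweighting step can be sketched as below. The per-bin event counts are assumed inputs and the function names are hypothetical; no numbers from the analysis are used.

```python
def double_ratios(data_counts, mc_counts):
    """Per-bin ratio of the normalized data and MC bin contents."""
    n_data = sum(data_counts)
    n_mc = sum(mc_counts)
    return [(d / n_data) / (m / n_mc) for d, m in zip(data_counts, mc_counts)]


def reweight(bin_contents, ratios):
    """Scale each network-output bin by its double ratio to build the
    alternate model used for the systematic variation."""
    return [c * r for c, r in zip(bin_contents, ratios)]
```

Because both samples are normalized to their own totals, an overall difference in normalization between data and MC cancels and only shape differences remain.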

The fixed yields of the measured $B \to X_u\ell^+\nu_\ell$ backgrounds are varied by their world-average errors [17]. The systematic uncertainty related to the $B^+ \to \ell^+\nu_\ell\gamma$ decay signal model is estimated by comparing the latest NLO model [3] with an older LO calculation [20]. Here, the shape difference in the $m_{\rm miss}^2$ distribution is found to be small and parametric errors of the theory are also found to have a negligible effect on the branching fraction determination.

The systematic uncertainty related to lepton ID is determined in $\gamma\gamma \to \ell^+\ell^-$ processes and the error is found to be 2.2% and 5.0% for electrons and muons, respectively. The error for the tag-side efficiency has been determined in Ref. [13] to be 4.2%. The error for the tag-side NN is taken from the sideband $m_{\rm miss}^2 > 0.3$ GeV$^2$/$c^4$, where the difference in the data-MC selection efficiency is taken as a systematic error. Systematic deviations for the tracking efficiency are determined with high transverse momentum tracks from partially reconstructed $D^*$ mesons; the deviation is $-0.13\%$.

To obtain the systematic error for the simultaneous 
fit to both channels, all errors are assumed to be fully 
correlated except for the errors on the fit shapes and the 
lepton ID, for which no correlation is assumed. The total 
systematic error is less than half of the statistical error. 
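
The combined errors in Table III are consistent with adding the additive and multiplicative groups in quadrature, separately for the upward and downward directions. The short check below uses only numbers quoted in Table III; the helper name is hypothetical, and quadrature addition is inferred from the table rather than stated in the text.

```python
import math


def quad_sum(*errors):
    """Combine independent uncertainties in quadrature."""
    return math.sqrt(sum(e * e for e in errors))


# Muon channel: additive +0.97/-1.47, multiplicative 0.57.
muon_up = quad_sum(0.97, 0.57)
muon_down = quad_sum(1.47, 0.57)

# Electron channel: additive +0.95/-1.27, multiplicative 0.55.
electron_up = quad_sum(0.95, 0.55)
electron_down = quad_sum(1.27, 0.55)

# Simultaneous fit: additive +1.64/-2.15, multiplicative 0.99.
combined_up = quad_sum(1.64, 0.99)
combined_down = quad_sum(2.15, 0.99)
```

The results agree with the quoted combined errors to within rounding of the inputs.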


^1 Several values of $\xi(E_\gamma)$ are calculated in Ref. [5] for different true values of $\lambda_B$. We identify the central value of $\xi(E_\gamma)$ with the one obtained for $\lambda_B = 300$ MeV. To obtain the error on $\xi(E_\gamma)$, the whole range of true values for $\lambda_B$ is taken into account.


VIII. CONCLUSION 

In summary, we report the upper limits of the partial branching fraction with $E_\gamma^{\rm sig} > 1$ GeV for semileptonic $B^+ \to \ell^+\nu_\ell\gamma$ decays with the full Belle data set of $(771.6 \pm 10.6) \times 10^6$ $B\bar{B}$ pairs. The signal photon energy requirement ensures a reliable theoretical description of the decay process. The results at 90% credibility level are

$\mathcal{B}(B^+ \to e^+\nu_e\gamma) < 6.1 \times 10^{-6}$,

$\mathcal{B}(B^+ \to \mu^+\nu_\mu\gamma) < 3.4 \times 10^{-6}$,

$\mathcal{B}(B^+ \to \ell^+\nu_\ell\gamma) < 3.5 \times 10^{-6}$.

These results improve the limits measured by BaBar [5]. The limit of the combined channel $B^+ \to \ell^+\nu_\ell\gamma$ translates into a boundary of $\lambda_B > 238$ MeV at 90% credibility level, where this limit evolves within the range $\lambda_B > (172, 410)$ MeV by varying the input parameters of the decay width. A secondary analysis with a lower signal photon energy threshold of $E_\gamma^{\rm sig} > 400$ MeV gives consistent results.